In this example I will use QGIS geoprocessing scripts through the qgisprocess library in R. I will use data from the 2005 Peru Demographic and Health Survey, and data from the Peruvian government on the locations of secondary schools.
We use buffers from each DHS primary sampling unit location and point in polygon operations to measure whether a community had a secondary school within 5km.
Then, we use a hierarchical model to test whether a woman’s educational attainment is related to physical access to secondary schooling.
This is an example of a derived variable that cannot be obtained without the use of the GIS.
library(sf)
## Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
library(qgisprocess)
## Using 'qgis_process' at 'C://OSGeo4W64/bin/qgis_process-qgis.bat'.
## Run `qgis_configure()` for details.
library(tmap)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
qgis_configure()
## getOption('qgisprocess.path') was not found.
## Sys.getenv('R_QGISPROCESS_PATH') was not found.
## Trying 'qgis_process' on PATH
## Error in rethrow_call(c_processx_exec, command, c(command, args), stdin, : Command 'qgis_process' not found @win/processx.c:994 (processx_exec)
## Found 2 QGIS installations containing 'qgis_process':
## C://OSGeo4W64/bin/qgis_process-qgis.bat
## C://OSGeo4W64/bin/qgis_process-qgis-dev.bat
## Trying command 'C://OSGeo4W64/bin/qgis_process-qgis.bat'
## Success!
peru_dhs_points<-st_read("~/OneDrive - University of Texas at San Antonio/projects/LAPIVpaper/PEGE52FL/PEGE52FL.shp", "PEGE52FL")
## Reading layer `PEGE52FL' from data source `C:\Users\ozd504\OneDrive - University of Texas at San Antonio\projects\LAPIVpaper\PEGE52FL\PEGE52FL.shp' using driver `ESRI Shapefile'
## Simple feature collection with 1414 features and 20 fields
## geometry type: POINT
## dimension: XY
## bbox: xmin: -81.30633 ymin: -18.22461 xmax: 0 ymax: 0
## geographic CRS: WGS 84
peru_dhs_points<-st_transform(peru_dhs_points, crs=24892)
#project the data into a meter based system
peru_dhs_points<-peru_dhs_points%>%
filter(LATNUM<0)
So, right now we only have points (locations of sampling units), so we need a buffer around each point in order to do our assessment of whether a school is within 5km of each community. To do this we will do a fixed distance buffer of 5km around each sampling location.
Here I do a 5km buffer around each PSU location.
peru_buff<-st_buffer(peru_dhs_points, dist = 5000)
now we have our parameters defined, we run the script:
tmap_mode("view")
## tmap mode set to interactive viewing
tm_basemap("OpenStreetMap.Mapnik")+
tm_shape(peru_buff)+
tm_polygons("DHSCLUST")+
tm_shape(peru_dhs_points)+
tm_dots()